Crowdsourcing for ICD10 Code to Concept Relationships
Abstract
In this work we leverage crowdsourcing in connection with machine learning techniques to validate candidate ICD10 Code to UMLS concept relationships that we generate. Our immediate use is in natural language understanding and machine learning approaches to automatically code electronic health record documents with ICD codes. Beyond auto-coding, the relationships will aid a wide variety of future medical applications, such as terminology-driven search in support of smart medical assistants.
Similar resources
LIMSI ICD10 coding Experiments on CépiDC Death Certificate Statements
We describe LIMSI experiments in ICD10 coding of death certificate statements with the CépiDc dataset of the CLEF eHealth 2016 Track 2. We tested a classifier with humanly-interpretable output, based on IR-style ranking of candidate ICD10 diagnoses. A tf.idf-weighted bag-of-feature vector was built for each training set code by merging all the statements found for this code in the training data....
Perform Three Data Mining Tasks with Crowdsourcing Process
For data mining studies, because of the complexity of doing feature selection process in tasks by hand, we need to send some of labeling to the workers with crowdsourcing activities. The process of outsourcing data mining tasks to users is often handled by software systems without enough knowledge of the age or geography of the users' residence. Uncertainty about the performance of virtual user...
Formalizing ICD coding rules using Formal Concept Analysis
BACKGROUND With the 11th revision of the International Classification of Diseases (ICD) being officially launched by the World Health Organization (WHO), the significance of a formal representation for ICD coding rules has emerged as a pragmatic concern. OBJECTIVES To explore the role of Formal Concept Analysis (FCA) on examining ICD10 coding rules and to develop FCA-based auditing approaches ...
ECSTRA-INSERM @ CLEF eHealth2016-task 2: ICD10 Code Extraction from Death Certificates
This paper describes the participation of ECSTRA-INSERM team at CLEF eHealth 2016, task 2.C. The task involves extracting ICD10 codes from death certificates, mainly described with short plain texts. We cast the task as a machine learning problem involving the prediction of the ICD10 codes (categorical variable) from the raw text transformed into a bag-of-words matrix. We rely on probabilistic ...
Crowdsourcing Question-Answer Meaning Representations
We introduce Question-Answer Meaning Representations (QAMRs), which represent the predicate-argument structure of a sentence as a set of question-answer pairs. We also develop a crowdsourcing scheme to show that QAMRs can be labeled with very little training, and gather a dataset with over 5,000 sentences and 100,000 questions. A detailed qualitative analysis demonstrates that the crowd-generat...